{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Point Cloud Based 3D Object Detection Inference using ONNX Runtime\n",
    "\n",
    "\n",
    "## Inference using Pre-Compiled Models\n",
    "In this example notebook, we describe how to use a pre-trained Point Cloud Based 3D Object Detection model for inference using the ***ONNX Runtime interface***.\n",
    "   - Model is based on point pillars [Link](https://arxiv.org/abs/1812.05784) method.\n",
    "   - Model is trained for only one class, which is 'car'\n",
    "   - We perform inference on a few sample point clouds\n",
    "   - We also describe the input preprocessing and output postprocessing steps, demonstrate how to collect various benchmarking statistics and how to visualize the data.\n",
    "\n",
    "\n",
    "## Point Cloud Based 3D Object Detection  \n",
    "\n",
    "3D Object Detection is a popular computer vision algorithm used in many applications such as Person Detection and Vehicle detection. 3D object information provides better understanding surrounding and hence helps in precise path planning. 3D object detection on lidar data outperforms than image data. \n",
    "\n",
    "## ONNX Runtime based Work flow\n",
    "\n",
    "The diagram below describes the steps for ONNX Runtime based workflow. \n",
    "\n",
    "Note:\n",
    "- The user needs to compile models(sub-graph creation and quantization) on a PC to generate model artifacts.\n",
    "    - For this notebook we use pre-compiled models artifacts\n",
    "- The generated artifacts can then be used to run inference on the target.\n",
    "- Users can run this notebook as-is, only action required is to select a model. \n",
    "\n",
    "<img src=docs/images/onnx_work_flow_2.png width=\"400\">"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "parameters"
    ]
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import cv2\n",
    "import numpy as np\n",
    "import ipywidgets as widgets\n",
    "from scripts.utils import get_eval_configs\n",
    "\n",
    "prebuilt_configs, selected_model_id = get_eval_configs('detection_3d','onnxrt', \n",
    "                                                       num_quant_bits = 8, \n",
    "                                                       last_artifacts_id = None,\n",
    "                                                       experimental_models=True)\n",
    "display(selected_model_id)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f'Selected Model: {selected_model_id.label}')\n",
    "config = prebuilt_configs[selected_model_id.value]\n",
    "config['session'].set_param('model_id', selected_model_id.value)\n",
    "config['session'].start()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create the model using the stored artifacts\n",
    "<div class=\"alert alert-block alert-warning\">\n",
    "<b>Warning:</b> It is recommended to use the ONNX Runtime APIs in the cells below without any modifications.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import onnxruntime as rt\n",
    "\n",
    "onnx_model_path = config['session'].get_param('model_file')\n",
    "delegate_options = {}\n",
    "so = rt.SessionOptions()\n",
    "delegate_options['artifacts_folder'] = config['session'].get_param('artifacts_folder')\n",
    "\n",
    "EP_list = ['TIDLExecutionProvider','CPUExecutionProvider']\n",
    "sess = rt.InferenceSession(onnx_model_path ,providers=EP_list, provider_options=[delegate_options, {}], sess_options=so)\n",
    "\n",
    "input_details = sess.get_inputs()\n",
    "output_details = sess.get_outputs()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sample Point Cloud data\n",
    "Prepare the list of point cloud data to be processed. Similar list has to prepared for image data and clibration data. \n",
    "    \n",
    "\n",
    "### Disclaimer ::\n",
    "  - We use one sample point cloud data, corresponding image data and calibration information from https://github.com/azureology/kitti-velo2cam/blob/master/readme.md. Currently only one lidar frame is hosted there. This data is used in main processing loop.\n",
    "  - This point cloud is from Kitti data set, and user of this jupyter notebook is assumed to have agreed to all the terms and conditions for usages of this dataset content. \n",
    "  - Refer \"http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d\" for license of usages for Kitti 3d-od dataset.\n",
    "  \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "    \n",
    "image_files = [\n",
    "    ('000007.png')\n",
    "]\n",
    "\n",
    "point_cloud_files =[\n",
    "    ('000007.bin'),\n",
    "]\n",
    "\n",
    "calib_files =[\n",
    "    ('000007.txt'),\n",
    "]\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Initialize configuration parameters required for point cloud pre-processing\n",
    "\n",
    "Input point cloud is segregated in 3D bins called as voxels.Features are computed for each voxels. pre-procssing step (voxelization) needs these configuration parameters\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "voxelization_config_params={}\n",
    "\n",
    "# voxel property \n",
    "voxelization_config_params['min_x'] = 0\n",
    "voxelization_config_params['max_x'] = 69.120\n",
    "voxelization_config_params['min_y'] = -39.680\n",
    "voxelization_config_params['max_y'] = 39.680\n",
    "voxelization_config_params['min_z'] = -3.0\n",
    "voxelization_config_params['max_z'] = 1.0\n",
    "voxelization_config_params['voxel_size_x']= 0.16\n",
    "voxelization_config_params['voxel_size_y']= 0.16\n",
    "voxelization_config_params['voxel_size_z']= 4.0\n",
    "voxelization_config_params['num_voxel_x'] = (voxelization_config_params['max_x'] - \n",
    "                                             voxelization_config_params['min_x'])/voxelization_config_params['voxel_size_x']\n",
    "\n",
    "voxelization_config_params['num_voxel_y'] = (voxelization_config_params['max_y'] - \n",
    "                                             voxelization_config_params['min_y'])/voxelization_config_params['voxel_size_y']\n",
    "\n",
    "# network property has to align with below parameters\n",
    "voxelization_config_params['max_points_per_voxel'] = 32\n",
    "voxelization_config_params['nw_max_num_voxels']    = 10000\n",
    "voxelization_config_params['num_feat_per_voxel']   = 10\n",
    "voxelization_config_params['num_channels']         = 64\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Pre-processing \n",
    "  - Point cloud data will have data from all around 360 degree, hence consider the point cloud \n",
    "    which are in front of camera view (`align_img_and_pc` API)\n",
    "  - Perform voxelization on a set of lidar data mentioned in the list 'point_cloud_files'. (`voxelization` API)\n",
    "  - Feature computation for each voxel. (`voxelization` API)\n",
    "\n",
    "### Inference\n",
    "  - Real TIDL inference \n",
    "  \n",
    "### Post-processing and Visualization\n",
    " - Object Detection models return results as a list (i.e. `numpy.ndarray`) with length of 9. Each element in this list contains, the detected object class ID, the probability of the detection and the 3d bounding box co-ordinates.\n",
    " - We use the `boxes3d_to_corners3d_lidar()` API to identify 8 corners of 3D box in 3 dimensional space.\n",
    " - `draw_lidar_bbox3d_on_img()` API is used for drawing lines between two corner\n",
    " - Then, in this notebook, we use *matplotlib* to plot the original images and the corresponding results.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tqdm\n",
    "import matplotlib.pyplot as plt\n",
    "from PIL import Image\n",
    "import urllib\n",
    "\n",
    "from scripts.utils_lidar import boxes3d_to_corners3d_lidar\n",
    "from scripts.utils_lidar import draw_lidar_bbox3d_on_img\n",
    "from scripts.utils_lidar import voxelization\n",
    "from scripts.utils_lidar import align_img_and_pc\n",
    "\n",
    "fig = plt.figure(figsize=(10,10))\n",
    "\n",
    "#Non Empty Voxel features. Populated by the `voxelization` API\n",
    "input0 = np.zeros((1, \n",
    "                   voxelization_config_params['num_feat_per_voxel'], \n",
    "                   voxelization_config_params['max_points_per_voxel'], \n",
    "                   voxelization_config_params['nw_max_num_voxels']),\n",
    "                  dtype='float32')\n",
    "\n",
    "#Voxel index in 2D convas. 2D convas where voxel features gets scattered. Populated by the `voxelization` API\n",
    "input1 = np.zeros((1, voxelization_config_params['num_channels'], \n",
    "                   voxelization_config_params['nw_max_num_voxels']),\n",
    "                  dtype='int32')\n",
    "\n",
    "#2D canvas initialized with zero\n",
    "input2 = np.zeros((1, voxelization_config_params['num_channels'], \n",
    "                   (int)(voxelization_config_params['num_voxel_x']*voxelization_config_params['num_voxel_y'])),\n",
    "                  dtype='float32')\n",
    "\n",
    "# In onnx rt flow double data read happens hence that can be discounted in DDR read measurement\n",
    "input_size_float = input0.size * input0.itemsize + input1.size * input1.itemsize + input2.size * input2.itemsize\n",
    "\n",
    "ax = []\n",
    "\n",
    "for num in tqdm.trange(len(image_files)):\n",
    "    image_file = image_files[num]\n",
    "    point_cloud_file = point_cloud_files[num]\n",
    "    calib_file       = calib_files[num]\n",
    "\n",
    "    # read int8 image data\n",
    "    req = urllib.request.urlopen('https://raw.githubusercontent.com/azureology/kitti-velo2cam/master/data_object_image_2/testing/image_2/' + image_file)\n",
    "    arr = np.asarray(bytearray(req.read()), dtype=np.int8)\n",
    "    img = cv2.imdecode(arr, -1) # 'Load it as it is'\n",
    "    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n",
    "\n",
    "    # read point float32 cloud data\n",
    "    req = urllib.request.urlopen('https://raw.githubusercontent.com/azureology/kitti-velo2cam/master/data_object_velodyne/testing/velodyne/' + point_cloud_file)\n",
    "    point_cloud_raw = np.fromstring(req.read(), dtype=np.float32).reshape(-1, 4)\n",
    "\n",
    "    # read calib text data\n",
    "    req = urllib.request.urlopen('https://raw.githubusercontent.com/azureology/kitti-velo2cam/master/testing/calib/' + calib_file)\n",
    "    calib_line = []\n",
    "    for line in req:\n",
    "        calib_line.append(line.decode(\"utf-8\"))\n",
    "\n",
    "    ### Pre-processing    \n",
    "    # point cloud data will have data from all around 360 degree, hence consider the point cloud \n",
    "    #which are in front of camera view\n",
    "    lidar_data, lidar2img_rt = align_img_and_pc(img, point_cloud_raw, calib_line)\n",
    "\n",
    "    voxelization(lidar_data,voxelization_config_params,input0[0],input1[0])\n",
    "    \n",
    "    ### Inference\n",
    "    output = sess.run(None, {input_details[0].name: input0, input_details[1].name: input1, input_details[2].name: input2})\n",
    "    selected_objs = output[0][0][0][(output[0][0][0][:,1]>0.3)]\n",
    "    ### Post-processing\n",
    "    corners = boxes3d_to_corners3d_lidar(selected_objs[:,2:])\n",
    "    img = draw_lidar_bbox3d_on_img(corners,img,lidar2img_rt)\n",
    "    \n",
    "    ax.append(fig.add_subplot(len(image_files), 1, num+1) )\n",
    "    plt.imshow(img)\n",
    "\n",
    "plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot Inference benchmarking statistics\n",
    " - During model execution several benchmarking statistics such as timestamps at different checkpoints, DDR bandwidth are collected and stored. \n",
    " - The `get_TI_benchmark_data()` function can be used to collect these statistics. The statistics are collected as a dictionary of `annotations` and corresponding markers.\n",
    " - We provide the utility function plot_TI_benchmark_data to visualize these benchmark KPIs.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "<b>Note:</b> The values represented by <i>Inferences Per Second</i> and <i>Inference Time Per Image</i> uses the total time taken by the inference except the time taken for copying inputs and outputs. In a performance oriented system, these operations can be bypassed by writing the data directly into shared memory and performing on-the-fly input / output normalization.\n",
    "</div>\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scripts.utils import plot_TI_performance_data, plot_TI_DDRBW_data, get_benchmark_output, print_soc_info\n",
    "stats = sess.get_TI_benchmark_data()\n",
    "fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(10,5))\n",
    "plot_TI_performance_data(stats, axis=ax)\n",
    "plt.show()\n",
    "\n",
    "tt, st, rb, wb = get_benchmark_output(stats)\n",
    "\n",
    "rb_exclude_float_ip = rb - input_size_float # saving because of onnxrt double input accounting\n",
    "\n",
    "print_soc_info()\n",
    "print(f'{selected_model_id.label} :')\n",
    "print(f' Inferences Per Second    : {1000.0/tt :7.2f} fps')\n",
    "print(f' Inference Time Per Image : {tt :7.2f} ms')\n",
    "print(f' DDR usage Per Lidar Frame: {rb+ wb : 7.2f} MB')\n",
    "print(f' DDR usage Per Lidar Frame excluding float input      : {rb_exclude_float_ip+ wb : 7.2f} MB')"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Tags",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
